posterior process

SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations

Bartosh, Grigory, Vetrov, Dmitry, Naesseth, Christian A.

arXiv.org Machine Learning

The Latent Stochastic Differential Equation (SDE) is a powerful tool for time series and sequence modeling. However, training Latent SDEs typically relies on adjoint sensitivity methods, which depend on simulating and backpropagating through approximate SDE solutions, limiting scalability. In this work, we propose SDE Matching, a new simulation-free method for training Latent SDEs. Inspired by modern Score- and Flow Matching algorithms for learning generative dynamics, we extend these ideas to the domain of stochastic dynamics for time series and sequence modeling, eliminating the need for costly numerical simulations. Our results demonstrate that SDE Matching achieves performance comparable to adjoint sensitivity methods while drastically reducing computational complexity.
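
To make the simulation-free idea concrete, the sketch below samples a random time t and a latent state z_t from an amortized, closed-form Gaussian marginal, then regresses a prior drift network onto the drift implied by that marginal, so no SDE solver appears in the training loop. All module names, shapes, and the exact objective here are illustrative assumptions, not the paper's actual parameterization.

```python
import torch
import torch.nn as nn

z_dim, x_dim = 4, 8
# Amortized Gaussian marginal q(z_t | x): closed form in t, so no solver needed.
q_net = nn.Sequential(nn.Linear(x_dim + 1, 64), nn.Tanh(), nn.Linear(64, 2 * z_dim))
# Prior drift network f_theta(z, t) of the latent SDE.
prior_drift = nn.Sequential(nn.Linear(z_dim + 1, 64), nn.Tanh(), nn.Linear(64, z_dim))

def marginal(x, t):
    mu, log_sig = q_net(torch.cat([x, t], dim=-1)).chunk(2, dim=-1)
    return mu, log_sig.exp()

def matching_loss(x):
    B = x.shape[0]
    t = torch.rand(B, 1)                       # random times, no trajectory rollout
    eps = torch.randn(B, z_dim)

    def z_of_t(t):                             # reparameterized sample path in t
        mu, sig = marginal(x, t)
        return mu + sig * eps

    z_t = z_of_t(t)
    # dz_t/dt, elementwise, via a Jacobian-vector product with tangent 1
    _, dz_dt = torch.autograd.functional.jvp(
        z_of_t, (t,), (torch.ones_like(t),), create_graph=True)
    pred = prior_drift(torch.cat([z_t, t], dim=-1))
    return ((pred - dz_dt) ** 2).mean()        # regress drift onto implied drift

loss = matching_loss(torch.randn(16, x_dim))
loss.backward()
```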


Neural Structure Learning with Stochastic Differential Equations

Wang, Benjie, Jennings, Joel, Gong, Wenbo

arXiv.org Machine Learning

Time-series data are ubiquitous in the real world, often comprising a series of data points recorded at varying time intervals. Understanding the underlying structures between variables associated with temporal processes is of paramount importance for numerous real-world applications (Spirtes et al., 2000; Berzuini et al., 2012; Peters et al., 2017). Although randomised experiments are considered the gold standard for unveiling such relationships, they are frequently hindered by factors such as cost and ethical concerns. Structure learning seeks to infer hidden structures from purely observational data, offering a powerful approach for a wide array of applications (Bellot et al., 2021; Löwe et al., 2022; Runge, 2018; Tank et al., 2021; Pamfil et al., 2020; Gong et al., 2022). However, many existing structure learning methods for time series are inherently discrete, assuming that the underlying temporal processes are discretized in time and requiring uniform sampling intervals throughout the entire time range. Consequently, these models face two key limitations: (i) they may misrepresent the true underlying process when it is continuous in time, potentially leading to incorrect inferred relationships; and (ii) they struggle with handling irregular sampling intervals, which frequently arise in fields such as biology (Trapnell et al., 2014; Qiu et al., 2017; Qian et al., 2020) and climate science (Bracco et al., 2018; Raia, 2008). Although prior work (Bellot et al., 2021) also infers the underlying structure from a continuous-time perspective, its framework is based on ordinary differential equations (ODEs) and therefore cannot capture the stochasticity inherent in many temporal processes.
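
As a generic illustration of coupling a learnable graph to continuous-time dynamics (a rough sketch under my own assumptions, not the paper's variational method), the code below gates each variable's drift by a column of soft edge weights and fits the graph with an Euler-discretized pseudo-likelihood plus a sparsity penalty:

```python
import torch
import torch.nn as nn

class MaskedDrift(nn.Module):
    """Drift whose i-th output may only read variables with an incoming edge."""
    def __init__(self, d, hidden=32):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(d, d))    # candidate graph
        self.nets = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.Tanh(), nn.Linear(hidden, 1))
            for _ in range(d))

    def forward(self, x):                                # x: (B, d)
        a = torch.sigmoid(self.logits)                   # soft edge weights
        return torch.cat([net(x * a[:, i]) for i, net in enumerate(self.nets)], -1)

def euler_nll(drift, x, dt, sigma=0.1, lam=1e-2):
    """Gaussian pseudo-likelihood of increments under an Euler discretization,
    plus a sparsity penalty on the edge probabilities."""
    dx = x[:, 1:] - x[:, :-1]                            # x: (B, T, d)
    mu = drift(x[:, :-1].reshape(-1, x.shape[-1])).reshape_as(dx) * dt
    nll = ((dx - mu) ** 2 / (2 * sigma**2 * dt)).mean()
    return nll + lam * torch.sigmoid(drift.logits).sum()

drift = MaskedDrift(d=3)
loss = euler_nll(drift, torch.randn(8, 20, 3), dt=0.1)
loss.backward()
```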


Variational Gaussian Process Diffusion Processes

Verma, Prakhar, Adam, Vincent, Solin, Arno

arXiv.org Machine Learning

Diffusion processes are a class of stochastic differential equations (SDEs) providing a rich family of expressive models that arise naturally in dynamic modelling tasks. Probabilistic inference and learning under generative models with latent processes endowed with a non-linear diffusion process prior are intractable problems. We build upon work within variational inference, approximating the posterior process as a linear diffusion process, and point out pathologies in the approach. We propose an alternative parameterization of the Gaussian variational process using a site-based exponential family description. This allows us to trade a slow inference algorithm with fixed-point iterations for a fast algorithm for convex optimization akin to natural gradient descent, which also provides a better objective for learning model parameters.
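
For intuition, here is a minimal numpy sketch of the Gaussian (linear-SDE) variational family this line of work builds on: the marginal moments of the linear process propagate by ODEs, and the KL to a nonlinear prior diffusion reduces to an expected squared drift residual. The site-based exponential-family parameterization the paper actually proposes is not shown; everything below (the double-well prior, shapes, grid) is an illustrative assumption.

```python
import numpy as np

def prior_drift(z):                      # a nonlinear prior: double-well diffusion
    return 4.0 * z * (1.0 - z**2)

def propagate_moments(A, b, L2, m0, S0, ts):
    """Marginal moments of the linear SDE dz = (A_t z + b_t) dt + L dW:
    dm = (A m + b) dt, dS = (2 A S + L^2) dt (scalar state)."""
    m, S = [m0], [S0]
    for k, dt in enumerate(np.diff(ts)):
        m.append(m[-1] + (A[k] * m[-1] + b[k]) * dt)
        S.append(S[-1] + (2 * A[k] * S[-1] + L2) * dt)
    return np.array(m), np.array(S)

def kl_drift_residual(A, b, m, S, L2, ts, n_mc=256):
    """KL between the linear variational process and the nonlinear prior:
    the time integral of E_q ||(A_t z + b_t) - f(z)||^2 / (2 L^2), with the
    Gaussian expectation estimated by sampling."""
    total = 0.0
    for k, dt in enumerate(np.diff(ts)):
        z = m[k] + np.sqrt(S[k]) * np.random.randn(n_mc)
        resid = (A[k] * z + b[k]) - prior_drift(z)
        total += (resid**2).mean() / (2 * L2) * dt
    return total

ts = np.linspace(0.0, 1.0, 101)
A, b, L2 = -2.0 * np.ones(100), np.zeros(100), 0.25
m, S = propagate_moments(A, b, L2, m0=1.0, S0=0.1, ts=ts)
print(kl_drift_residual(A, b, m, S, L2, ts))
```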


Scale invariant process regression: Towards Bayesian ML with minimal assumptions

Wieler, Matthias

arXiv.org Artificial Intelligence

Current methods for regularization in machine learning require quite specific model assumptions (e.g. a kernel shape) that are not derived from prior knowledge about the application, but must be imposed merely to make the method work. We show in this paper that regularization can indeed be achieved by assuming nothing but invariance principles (w.r.t. scaling, translation, and rotation of input and output space) and the degree of differentiability of the true function. Concretely, we derive a novel (non-Gaussian) stochastic process from the above minimal assumptions, and we present a corresponding Bayesian inference method for regression. The posterior mean turns out to be a polyharmonic spline, and the posterior process is a mixture of t-processes. Compared with Gaussian process regression, the proposed method shows equal performance and has the advantages of being (i) less arbitrary (no choice of kernel), (ii) potentially faster (no kernel parameter optimization), and (iii) better at extrapolation. We believe that the proposed theory has central importance for the conceptual foundations of regularization and machine learning and also has great potential to enable practical advances in ML areas beyond regression.
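
Since the abstract identifies the posterior mean with a polyharmonic spline, a standard 1-D polyharmonic spline fit (basis |r|^3 plus an affine tail) gives some intuition for what that mean looks like. This is the classical spline solve, not the paper's full Bayesian posterior (which is a mixture of t-processes); all values are illustrative.

```python
import numpy as np

def fit_polyharmonic_1d(x, y):
    """Solve for s(t) = sum_i w_i |t - x_i|^3 + a + b t, with the usual
    orthogonality constraints on w absorbed into the block system."""
    n = x.size
    K = np.abs(x[:, None] - x[None, :]) ** 3
    P = np.stack([np.ones(n), x], axis=1)          # affine tail
    A = np.block([[K, P], [P.T, np.zeros((2, 2))]])
    sol = np.linalg.solve(A, np.concatenate([y, np.zeros(2)]))
    return sol[:n], sol[n:]

def predict(x, w, c, t):
    return (w * np.abs(t[:, None] - x[None, :]) ** 3).sum(1) + c[0] + c[1] * t

x = np.linspace(0, 1, 8)
y = np.sin(2 * np.pi * x)
w, c = fit_polyharmonic_1d(x, y)
print(predict(x, w, c, np.array([0.25, 0.5])))     # interpolates smoothly
```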


Continuous-time Particle Filtering for Latent Stochastic Differential Equations

Deng, Ruizhi, Mori, Greg, Lehrmann, Andreas M.

arXiv.org Artificial Intelligence

Particle filtering is a standard Monte Carlo approach for a wide range of sequential inference tasks. The key component of a particle filter is a set of particles with importance weights that serve as a proxy for the true posterior distribution of some stochastic process. In this work, we propose continuous latent particle filters, an approach that extends particle filtering to the continuous-time domain. We demonstrate how continuous latent particle filters can be used as a generic plug-in replacement for inference techniques relying on a learned variational posterior. Our experiments with different model families based on latent neural stochastic differential equations demonstrate superior performance of continuous-time particle filtering in inference tasks like likelihood estimation and sequential prediction for a variety of stochastic processes.
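
A minimal bootstrap-style sketch of the mechanics: particles are propagated between observation times by Euler-Maruyama simulation of a latent SDE and reweighted by the observation likelihood. The paper's continuous-time construction is more general; the drift, diffusion, and observation model below are placeholder assumptions.

```python
import numpy as np

def drift(z):  return -z                       # placeholder latent SDE
def diff(z):   return 0.5
def log_obs(x, z):                             # placeholder log p(x | z)
    return -0.5 * (x - z) ** 2

def particle_filter(xs, t_obs, n=500, dt=0.01):
    z = np.random.randn(n)                     # initial particle cloud
    loglik, t = 0.0, t_obs[0]
    for x, t_next in zip(xs, t_obs):
        while t < t_next:                      # continuous-time propagation
            h = min(dt, t_next - t)
            z = z + drift(z) * h + diff(z) * np.sqrt(h) * np.random.randn(n)
            t += h
        logw = log_obs(x, z)                   # reweight at the observation
        m = logw.max()
        loglik += m + np.log(np.exp(logw - m).mean())
        w = np.exp(logw - m)
        z = z[np.random.choice(n, n, p=w / w.sum())]   # multinomial resampling
    return loglik

t_obs = np.linspace(0.0, 2.0, 21)
xs = np.sin(t_obs) + 0.1 * np.random.randn(21)
print(particle_filter(xs, t_obs))              # log-likelihood estimate
```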


Continuous Latent Process Flows

Deng, Ruizhi, Brubaker, Marcus A., Mori, Greg, Lehrmann, Andreas M.

arXiv.org Machine Learning

Partial observations of continuous time-series dynamics at arbitrary time stamps exist in many disciplines. Fitting this type of data using statistical models with continuous dynamics is not only intuitively appealing but also has practical benefits, including the ability to generate continuous trajectories and to perform inference on previously unseen time stamps. Despite exciting progress in this area, existing models still face challenges in representational power and the quality of their variational approximations. We tackle these challenges with continuous latent process flows (CLPF), a principled architecture decoding continuous latent processes into continuous observable processes using a time-dependent normalizing flow driven by a stochastic differential equation. To optimize our model using maximum likelihood, we propose a novel piecewise construction of a variational posterior process and derive the corresponding variational lower bound using trajectory re-weighting. Our ablation studies demonstrate the effectiveness of our contributions in various inference tasks on irregular time grids. Comparisons to state-of-the-art baselines show our model's favourable performance on both synthetic and real-world time-series data.
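
A sketch of the decoding idea: a simulated latent SDE path z_t drives a time-dependent invertible map, so the observable process has a tractable change-of-variables density conditioned on the latent state. The single affine layer below is a stand-in for a full normalizing flow, and the latent dynamics are a placeholder; none of this is the paper's exact architecture.

```python
import math
import torch
import torch.nn as nn

def latent_path(z0, T, dt=0.01):
    """Euler-Maruyama rollout of a placeholder latent SDE dz = -z dt + dW."""
    zs, z = [z0], z0
    for _ in range(int(T / dt)):
        z = z - z * dt + (dt ** 0.5) * torch.randn_like(z)
        zs.append(z)
    return torch.stack(zs)

class AffineDecoder(nn.Module):
    """One affine flow layer whose scale/shift depend on (z_t, t)."""
    def __init__(self, z_dim, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim + 1, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2))

    def log_prob(self, x, z, t):
        """log p(x | z_t, t) for x = exp(s) * o + m with base noise o ~ N(0, 1)."""
        s, m = self.net(torch.cat([z, t], dim=-1)).chunk(2, dim=-1)
        o = (x - m) * torch.exp(-s)            # invert the flow
        base = -0.5 * (o**2 + math.log(2 * math.pi))
        return (base - s).sum(-1)              # change of variables

zs = latent_path(torch.zeros(1, 2), T=1.0)     # latent path, z_dim = 2
dec = AffineDecoder(z_dim=2)
print(dec.log_prob(torch.tensor([[0.3]]), zs[50], torch.tensor([[0.5]])))
```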


Moment-Based Variational Inference for Markov Jump Processes

Wildner, Christian, Koeppl, Heinz

arXiv.org Machine Learning

We propose moment-based variational inference as a flexible framework for approximate smoothing of latent Markov jump processes. The main ingredient of our approach is to partition the set of all transitions of the latent process into classes. This allows us to express the Kullback-Leibler divergence between the approximate and the exact posterior process in terms of a set of moment functions that arise naturally from the chosen partition. To illustrate possible choices of the partition, we consider special classes of jump processes that frequently occur in applications. We then extend the results to parameter inference and demonstrate the method on several examples.
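
The path KL between two Markov jump processes with rates lam_q and lam_p has the standard integrand E_q[lam_q log(lam_q / lam_p) - lam_q + lam_p], which depends on the approximate process only through expectations of the state; this is what makes a moment-based description natural. The birth-death example below evaluates that integrand under a Poisson moment closure for the q-marginal, as my own illustrative assumption; the paper's partition-into-transition-classes machinery is omitted.

```python
import numpy as np

def kl_rate(mean, birth_q, death_q, birth_p, death_p, n_max=200):
    """KL-rate integrand E_q[lam_q log(lam_q / lam_p) - lam_q + lam_p] at one
    time point, summed over two transition classes (births, deaths), with the
    q-marginal closed as Poisson(mean)."""
    n = np.arange(n_max)
    log_fact = np.cumsum(np.log(np.maximum(n, 1)))       # log n!
    logp = n * np.log(mean) - mean - log_fact            # Poisson closure
    p = np.exp(logp - logp.max())
    p /= p.sum()
    total = 0.0
    for lq, lp in [(birth_q(n), birth_p(n)), (death_q(n), death_p(n))]:
        ratio = np.where(lq > 0, lq / np.maximum(lp, 1e-12), 1.0)
        total += (p * (lq * np.log(ratio) - lq + lp)).sum()
    return total

# Birth-death example: variational rates vs. prior rates.
print(kl_rate(5.0,
              birth_q=lambda n: 2.0 + 0.0 * n, death_q=lambda n: 0.5 * n,
              birth_p=lambda n: 1.5 + 0.0 * n, death_p=lambda n: 0.5 * n))
```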


Learning interpretable continuous-time models of latent stochastic dynamical systems

Duncker, Lea, Bohner, Gergo, Boussard, Julien, Sahani, Maneesh

arXiv.org Machine Learning

We develop an approach to learn an interpretable semi-parametric model of a latent continuous-time stochastic dynamical system, assuming noisy high-dimensional outputs sampled at uneven times. The dynamics are described by a nonlinear stochastic differential equation (SDE) driven by a Wiener process, with a drift evolution function drawn from a Gaussian process (GP) conditioned on a set of learnt fixed points and corresponding local Jacobian matrices. This form yields a flexible nonparametric model of the dynamics, with a representation corresponding directly to the interpretable portraits routinely employed in the study of nonlinear dynamical systems. The learning algorithm combines inference of continuous latent paths underlying observed data with a sparse variational description of the dynamical process. We demonstrate our approach on simulated data from different nonlinear dynamical systems.
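
The core construction, a drift conditioned on learnt fixed points and local Jacobians, can be sketched with standard GP derivative conditioning: condition a GP on f(x_i) = 0 at the fixed points and on f'(x_i) = J_i, using the derivative kernels of an RBF GP. The 1-D example below is an illustration under my own assumptions (RBF kernel, scalar state), not the paper's sparse variational algorithm.

```python
import numpy as np

ELL = 0.7                                        # RBF lengthscale (illustrative)

def k(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ELL**2)

def k_dy(a, b):                                  # cov(f(a), f'(b))
    return k(a, b) * (a[:, None] - b[None, :]) / ELL**2

def k_dxdy(a, b):                                # cov(f'(a), f'(b))
    d = a[:, None] - b[None, :]
    return k(a, b) * (1.0 / ELL**2 - d**2 / ELL**4)

def drift_mean(t, fx, J, jitter=1e-8):
    """Posterior mean of f(t) given f(fx) = 0 (fixed points) and f'(fx) = J
    (local Jacobians), via GP derivative conditioning."""
    K = np.block([[k(fx, fx),      k_dy(fx, fx)],
                  [k_dy(fx, fx).T, k_dxdy(fx, fx)]])
    K += jitter * np.eye(K.shape[0])
    ks = np.concatenate([k(t, fx), k_dy(t, fx)], axis=1)
    obs = np.concatenate([np.zeros_like(fx), J])
    return ks @ np.linalg.solve(K, obs)

fx = np.array([-1.0, 1.0])                       # two learnt fixed points
J = np.array([-2.0, -2.0])                       # locally stable linearizations
print(drift_mean(np.linspace(-2, 2, 5), fx, J))
```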